Abstract

This paper takes a critical look at the usefulness of power law models of the 
Internet. The twin focuses of the paper are Internet traffic and topology 
generation. The aim of the paper is twofold. Firstly, it summarises the state
of the art in power law modelling, giving particular attention to existing
open research questions. Secondly, it provides insight into the failings of such
models and where progress needs to be made for power law research to feed
through to actual improvements in network performance.

1. Introduction

Power laws describe a wide range of phenomena in nature and a large body 
of ongoing research investigates their applicability in fields such as computer 
science, physics, biology, social sciences and economics. Power law distribu- 
tions are characterised by a slower than exponentially decaying probability 
tail, which loosely means that large values can occur with a non-negligible 
probability (see the next section for formal definitions). They can be used 
to characterise a variety of relations, for example the distribution of
income, city population, citations of scientific papers, word frequencies,
computer file sizes and the number of daily hits to a given website. See [1] and
[2] and references therein for further examples.

The aim of this paper is not to be a general survey of power laws in net- 
works but instead to be a critical look at open questions and the outcome 
of such research, in particular with regard to the question "How can power 
law modelling improve network performance?" The paper looks at two sep- 
arate areas where power law research has been of interest in the Internet. 
The study of power laws in the analysis of Internet traffic characteristics has 
been ongoing since 1993 and in Internet topology generation since 1999. 

In 1993, the seminal paper (expanded in jij]) found evidence of the 
existence of power law relationships in network traffic by observing long- 
range correlation in Local Area Network (LAN) traffic. This brought the 
concept of self-similarity, and the related concept of Long-Range Dependence
(LRD), into the field of network traffic and performance analysis. Before 
this finding, network traffic and performance studies had been mainly based 
on models, such as Poisson processes, which assume that traffic exhibits no 
long-term correlation. In networks with long-range correlated traffic, queuing
performance can be very different from that of traffic assumed independent or only
having short-term correlations. Subsequently, power law relationships have
been observed in several other contexts on many different types of network. 

In 1999 it was also discovered that the global Internet structure is characterised
by a power law. That is, the probability distribution of a node's
connectivity (measured for example by the number of BGP peering relations
that an autonomous system has) follows a power law. This discovery invalidated
previous Internet models that were based on classical random
graphs. Since then a lot of effort has been put into studying the Internet
power law structure [14, 15].
This paper reviews the measurements and models of the Internet topol- 
ogy, and comments upon whether the power law is in itself an adequate 
characterisation of the system. It questions whether models based on power 
laws provide a suitable platform for theoretical and simulation analysis of the 
Internet's traffic and topological characteristics. Finally, it provides discus- 
sion of how such research could be of use in improving network performance 
(which, after all, should be the ultimate goal of networking research). 

The structure of the paper is as follows. Section 1.1 provides the basic
mathematical definitions used throughout the paper: heavy-tailed distributions,
long-range dependence and statistical self-similarity. Section 2
describes the use of power law relationships to model the statistical nature
of Internet traffic. Section 3 discusses "scale-free networks", a power law
relationship which describes the connectivity of networks.

1.1. Basic definitions 

In the sense meant in this paper, a power law relationship is a function
$f(x)$ of the form $f(x) = a x^{\beta}$, where $a$ and $\beta$ are non-zero constants. Several
relationships of interest in the Internet have been shown to have this form
asymptotically (usually as $x \to \infty$).

Definition 1. A random variable $X$ (which may be continuous or discrete)
is said to have a heavy-tailed distribution if, for every $\varepsilon > 0$, it satisfies

$$P[X > x]\, e^{\varepsilon x} \to \infty \quad \text{as } x \to \infty.$$

Often a specific power law form is assumed for the distribution:

$$P[X > x] \sim C x^{-\alpha},$$

for some $C > 0$ and some $\alpha \in (0, 2)$. The symbol $\sim$ here and for the
rest of this paper means asymptotically equal to, that is $f(x) \sim \phi(x) \Leftrightarrow
f(x)/\phi(x) \to 1$ as $x \to \infty$ (or occasionally, some other limit).
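As an illustration of Definition 1, the following Python sketch draws samples from an exact Pareto law, for which $P[X > x] = (x_{min}/x)^{\alpha}$, and recovers $\alpha$ by maximum likelihood. The function names, the `seed` argument and the use of an exact Pareto law (rather than a distribution which is only asymptotically power law) are assumptions made for the example and are not taken from the works cited here.

```python
import math
import random


def sample_pareto(n, alpha, x_min=1.0, seed=0):
    """Draw n samples with P[X > x] = (x_min / x)**alpha for x >= x_min."""
    rng = random.Random(seed)
    # Inverse-transform sampling: if U is uniform on (0, 1] then
    # x_min * U**(-1/alpha) has the Pareto tail given above.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


def tail_index_mle(samples, x_min=1.0):
    """Maximum-likelihood estimate of alpha when x_min is known."""
    logs = [math.log(x / x_min) for x in samples if x >= x_min]
    return len(logs) / sum(logs)


def empirical_ccdf(samples, x):
    """Fraction of samples strictly greater than x, an estimate of P[X > x]."""
    return sum(1 for s in samples if s > x) / len(samples)
```

For example, `tail_index_mle(sample_pareto(20000, 1.5))` should return a value close to 1.5. On real measurements the lower cut-off `x_min` is not known in advance, so an estimate of this kind is only a first check on tail behaviour.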

Definition 2. Let $\{X_1, X_2, \ldots\}$ be a time series. The series is weakly-stationary
if it has a constant and finite mean ($E[X_i] = \mu$ for all $i$, where
$E$ means expectation) and for all $i, j \in \mathbb{N}$ the covariance between $X_i$ and $X_j$
(i.e. $E[(X_i - \mu)(X_j - \mu)]$) depends only on $|j - i|$.

Weak stationarity is assumed for much that follows but in practice is not
met by real network traffic over all timescales (for example, over a sufficiently
long time the mean traffic level is not stationary: it varies with daily and
weekly periodicity) and may not be met at all.

If the time series is weakly-stationary then the Auto Correlation Function
(ACF) $\rho(k)$ is given by

$$\rho(k) = \frac{E[(X_t - \mu)(X_{t+k} - \mu)]}{\sigma^2},$$

where $\mu$ is the mean and $\sigma^2$ is the variance.
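In practice the ACF has to be estimated from a finite trace. A minimal Python sketch of the usual sample estimator is given below; the function name and the choice of the biased estimator (dividing by the series length at every lag) are assumptions made for the example.

```python
def sample_acf(series, max_lag):
    """Return [rho(0), ..., rho(max_lag)] estimated from one realisation."""
    n = len(series)
    mean = sum(series) / n
    centred = [x - mean for x in series]
    variance = sum(c * c for c in centred) / n
    if variance == 0.0:
        raise ValueError("the ACF is undefined for a constant series")
    acf = []
    for k in range(max_lag + 1):
        # Only n - k products are available at lag k, so the estimate
        # becomes less reliable as k grows.
        cov = sum(centred[t] * centred[t + k] for t in range(n - k)) / n
        acf.append(cov / variance)
    return acf
```

The comment in the loop points at a practical difficulty: the high lags, which matter most for long-range dependence, are exactly those with the fewest observations.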

The ACF allows the definition of long-range dependence, which is sometimes
called long memory or strong dependence. A standard reference on the
topic is [18]. A commonly used definition is the following.

Definition 3. A weakly-stationary time series is long-range dependent if
the sum of the autocorrelation over all lags, $\sum_{k=1}^{\infty} \rho(k)$, diverges.

Definition 4. The Hurst parameter is a commonly used measure of LRD.
This makes the assumption that the ACF follows the specific functional form

$$\rho(k) \sim C_\rho k^{-\alpha} = C_\rho k^{2H-2}, \qquad (1)$$

where $C_\rho > 0$, $\alpha \in (0, 1)$ and $H \in (1/2, 1)$ is the Hurst parameter, so that $\alpha = 2 - 2H$.

Note that sometimes it is this and not Definition 3 which is taken as the
definition of LRD. Other measures of LRD include Hurstiness [19, Chapter
8] and the 'strength' parameter used by [17].

LRD processes which meet Definition 3 but not Definition 4 will have no
well-defined Hurst parameter. The value $H = 1/2$ is usually taken to mean
independent or short-range dependent data. Values of $H \in (0, 1/2)$ are
sometimes termed anti-long-range dependence. Values of $H < 0$ or $H > 1$
do not give useful models [18, Section 2.3].

LRD can also be considered in the frequency domain. In this case, the 
characteristic of LRD is a pole in the spectral density (usually at zero). 

Definition 5. Let $Y_t$ be a stochastic process in continuous time $t > 0$. The
process is exactly self-similar with self-similarity parameter $H$ if for any
choice of constant $c > 0$, the rescaled process $c^{-H} Y_{ct}$ is equal in distribution
to the original process $Y_t$.

Note that a similar definition can be given for a discrete-time stochastic process.

Definition 6. Let $Y_t$ be a stochastic process and $Y_t^{(m)}$ be the process derived
from it by $Y_t^{(m)} = \frac{1}{m} \sum_{i=tm-(m-1)}^{tm} Y_i$. The process $Y_t$ is exactly second-order
self-similar if, for all $m$, the process $\{m^{1-H} Y_t^{(m)}\}$ has the same variance and
autocorrelation function as $Y_t$. That is to say, for all $k \in \mathbb{N}$ and $m \in \mathbb{N}$,

$$\mathrm{var}\left(Y_t^{(m)}\right) = \frac{\mathrm{var}(Y_t)}{m^{2-2H}}$$

and

$$\rho_m(k) = \rho(k), \qquad (2)$$

where $\rho(\cdot)$ is the ACF of $Y_t$ and $\rho_m(\cdot)$ is the ACF of $Y_t^{(m)}$.

Second-order self-similarity can also be defined in terms of the second
central difference operator [20]. A process $Y_t$ is asymptotically second-order
self-similar if (2) holds as $k \to \infty$.
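The variance relation in Definition 6 suggests a simple way to estimate $H$ from data: compute the variance of the block means at several aggregation levels $m$ and fit a straight line on log-log axes, whose slope is $2H - 2$. The Python sketch below shows the idea. The function names, the default aggregation levels and the `white_noise` helper are assumptions made for the example, and an estimator this simple can be misled by trends or periodicity in the data.

```python
import math
import random


def aggregate(series, m):
    """Block means Y^(m): average non-overlapping blocks of length m."""
    blocks = len(series) // m
    return [sum(series[b * m:(b + 1) * m]) / m for b in range(blocks)]


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def hurst_aggregated_variance(series, levels=(1, 2, 4, 8, 16, 32, 64)):
    """Least-squares slope of log var(Y^(m)) against log m equals 2H - 2."""
    xs = [math.log(m) for m in levels]
    ys = [math.log(variance(aggregate(series, m))) for m in levels]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return 1.0 + slope / 2.0


def white_noise(n, seed=0):
    """Independent Gaussian samples, for which H = 1/2 is expected."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]
```

Applied to `white_noise(16384)` the estimate should lie close to 1/2, since the variance of the block means of independent data falls as $1/m$.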

Finally it remains to define scale-free (or power law) networks. 

Definition 7. Let $G$ be an undirected graph. Let $P_k$ be the probability that
a randomly selected node in $G$ has degree $k$. The graph $G$ is scale-free if $P_k$
(the node-degree distribution) is heavy-tailed, that is, if

$$P_k \sim C k^{-\alpha},$$

where $C > 0$ is a constant and $\alpha \in (0, 2)$.

Similar definitions can be constructed for a directed graph. The in-node 
degree distribution and the out-node degree distributions are treated sepa- 
rately in this case. 
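To make the node-degree distribution concrete, the following Python sketch computes the empirical $P_k$ of an undirected graph from a list of edges, together with its complementary cumulative form, which is usually less noisy when plotted on log-log axes. The function names and the edge-list input format are assumptions made for the example.

```python
from collections import Counter


def degree_distribution(edges):
    """Return {k: P_k} for an undirected graph given as (u, v) pairs.

    Nodes with no edges do not appear in an edge list, so degree 0
    is never reported.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    counts = Counter(degree.values())
    return {k: c / n for k, c in sorted(counts.items())}


def degree_ccdf(p_k):
    """Return {k: P[degree >= k]} from a degree distribution."""
    ccdf, tail = {}, 0.0
    for k in sorted(p_k, reverse=True):
        tail += p_k[k]
        ccdf[k] = tail
    return dict(sorted(ccdf.items()))
```

For a directed graph the same counting would be done twice, once on the first element of each pair and once on the second, giving the out-degree and in-degree distributions separately.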

A process which scales in a constant way is sometimes referred to as monofractal.
A generalisation of this is a multi-fractal process which exhibits
complex behaviour that changes over different timescales [21]. When the
multi-fractal behaviour can be approximated by a combination of two (or a
small number of) monofractals then the process is sometimes described as
having monofractal behaviour at different timescales rather than multifractal
behaviour.

There are many connections to be made between these power laws, some 
more obvious than others. For example, a scale-free network is simply an ex- 
ample of a heavy-tailed distribution (as its node-degree distribution is heavy- 
tailed). 

One connection which is sometimes less than clear from the literature is
that exact second-order self-similarity as in Definition 6 implies LRD of the
form given by (1). LRD of the form in (1) implies asymptotic self-similarity.
The details of this relationship can be found in [20] and [22]. There is a
more subtle connection between self-similarity and long-range dependence.
If a self-similar process $Y_t$ has stationary increments and $H \in (0, 1)$ then
it can be shown (see [18, page 51]) that the increment process given by
$X_i = Y_i - Y_{i-1}$ for $i \in \mathbb{N}$ has an ACF given by $\rho(k) \sim H(2H - 1) k^{2H-2}$,
which implies that for $H \in (1/2, 1)$ the increment process is long-range
dependent.
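The asymptotic form above can be checked numerically against the standard closed-form ACF of the unit-lag increments of a finite-variance self-similar process with stationary increments, $\rho(k) = \frac{1}{2}\left(|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H}\right)$. The Python sketch below is an illustration only; the closed form is the textbook expression rather than a formula quoted in this paper, and the function names are assumptions.

```python
def increment_acf(k, hurst):
    """Exact ACF at lag k of the unit-lag increments (standard closed form)."""
    k = abs(k)
    two_h = 2.0 * hurst
    return 0.5 * ((k + 1) ** two_h - 2.0 * k ** two_h + abs(k - 1) ** two_h)


def power_law_acf(k, hurst):
    """Asymptotic form H(2H - 1) k^(2H - 2), intended for large k."""
    return hurst * (2.0 * hurst - 1.0) * k ** (2.0 * hurst - 2.0)
```

For $H = 0.8$ the ratio `increment_acf(k, 0.8) / power_law_acf(k, 0.8)` approaches 1 as `k` grows, while for $H = 1/2$ the exact ACF is zero at every non-zero lag, consistent with independent increments.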

The connection between heavy-tails and long-range dependence is more
subtle. One such connection is [23, Theorem 4.3], which states that an
on/off process with heavy-tailed on periods and off periods which fall off
faster is a long-range dependent process. Other connections between power
laws can be found in [24], [25]. These papers show that multiplexing a high
number of independent on/off sources with heavy-tailed strictly alternating
on and/or off periods gives rise to self-similarity.
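The on/off construction can be sketched in a few lines of Python. The example below multiplexes independent sources whose strictly alternating on and off periods are drawn from a Pareto law and returns the number of active sources in each time slot. The function names, the slotted time axis, the default tail indices and the random initial state are all assumptions made for the example; it is not the exact construction analysed in the cited papers.

```python
import random


def pareto_slots(rng, alpha, cap):
    """Heavy-tailed period length in whole slots, between 1 and cap."""
    length = int((1.0 - rng.random()) ** (-1.0 / alpha))
    return max(1, min(length, cap))


def on_off_source(rng, slots, alpha_on, alpha_off):
    """0/1 activity of one source with strictly alternating periods."""
    activity = []
    is_on = rng.random() < 0.5
    while len(activity) < slots:
        alpha = alpha_on if is_on else alpha_off
        # Cap each period at the remaining trace length to bound memory use.
        length = pareto_slots(rng, alpha, slots - len(activity))
        activity.extend([1 if is_on else 0] * length)
        is_on = not is_on
    return activity


def multiplex(sources, slots, alpha_on=1.4, alpha_off=1.4, seed=0):
    """Number of active sources in each slot for independent on/off sources."""
    rng = random.Random(seed)
    total = [0] * slots
    for _ in range(sources):
        for t, active in enumerate(on_off_source(rng, slots, alpha_on, alpha_off)):
            total[t] += active
    return total
```

The resulting series can be passed to an ACF or Hurst parameter estimator to examine how the correlation structure changes with the tail indices and the number of sources.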

2. Power laws and Internet traffic 

Nearly fifteen years ago, the seminal paper Q found the existence of 
power law behaviour in Internet traffic. A time series describing LAN Ether- 
net packet traces at Bellcore showed evidence of second order self-similarity 
or long-range dependence. This paper for the first time questioned tradi- 
tional modelling assumptions and showed that existing models (often based 
on Poisson processes) would not correctly estimate important characteristics 
of a network. Since this paper, many hundreds of papers have been published 
about the power law behaviour of Internet traffic. A recent edition of the 
journal Performance Evaluation was devoted to this topic and the editorial 
describes modelling of LRD and heavy-tails as "One of the most important 
research topics in performance modelling and evaluation in the last decade"
[26].



2.1. Measuring long-range dependence 

The Hurst parameter is often used as an estimate of traffic's LRD. This 
parameter however has to be used with prudence, as measuring traffic LRD 
and statistical self-similarity is a complex task which may be affected by many 
factors. Although the estimation process can provide an indication of the existence
of long-range dependent characteristics, it does not unequivocally prove
the existence of authentic LRD, as these characteristics may simply be due
to traffic non-stationarity. In the time domain, the estimation of the Hurst
parameter is characterised by the fall off of the ACF at high lag. However, 
the high lag measurements are those at which the fewest readings are avail- 
able and the data is most unreliable. Similarly, in the frequency domain, the 
LRD is characterised by the behaviour of the spectrum at frequencies near 
zero, which are necessarily hard frequencies to measure. In terms of queuing
performance, despite the common misconception, a high Hurst parameter
does not always lead to worse performance or longer queues [27]. In fact,
depending on the timescales of interest, traffic with a high Hurst parameter
can lead to better performance than traffic with a low Hurst parameter. No
single Hurst parameter estimator can be considered infallible, as an estimator can
hide LRD when it exists or create it when it does not [28]. In addition, the
Hurst parameter itself expresses the traffic scaling of the fluctuations around
the mean and does not measure traffic burstiness.

It is certain that simply examining the ACF is not a robust way to 
estimate the Hurst parameter. In addition, a number of biases may be 
present in real-life data which could cause problems. These include periodicity
(daily usage patterns of users and processes) and trends (traffic volume
changes throughout the measurement period), which violate the assumption
of weak stationarity. The topic of measuring LRD is beyond the scope of
this paper; the reader is referred to [29, 30] for work which compares existing
techniques.

2.2. Evidence for and against power law behaviour in Internet traffic 

The original long-range dependence findings reported in 1993 have
subsequently been replicated in many different studies. In 1995, Floyd and
Paxson [31] found that WAN traffic is also consistent with self-similar scaling.
These findings were confirmed in the late nineties in [32], [33]. In particular,
in [32] the authors analyse WWW traffic and observe self-similarity in
the patterns of recorded traffic and a heavy-tailed distribution in the sizes of
the files transferred. They claim that the heavy-tailed sizes of transferred files are
the cause of the observed self-similarity. Also in the late nineties, evidence
was found that heavy-tailed distributions characterised a number of different
measurements related to network traffic. In [34] the authors report on observations
of heavy-tailed distributions of file sizes on web servers and also
of CPU time taken by processes.

The paper [35] analyses WWW flow duration distribution at a lightly
utilised academic campus Internet access. It finds that the tail of the flow
duration distribution does not stabilise. The suggestion is that the best fit
to the data is with a power law which varies in time.

In 2005, [36] also investigated the power law behaviour of WWW traffic
and found evidence of self-similarity over a number of timescales.

The paper [17] is sometimes cited as evidence that LRD is not an important
property of Internet traffic. The data they analyse was collected in 2000
on a 100 Mbps Bell Labs Ethernet link. Looking at inter-arrival times the au- 
thors find that when the traffic has more connections present the "strength" 
of the LRD is decreased. Note that this "strength" is not related to the Hurst
parameter but could be considered analogous to the proportion of the traffic
which exhibits LRD. Their conclusion is that as the number of connections
in the network increases the traffic will remain long-range dependent, but
that the strength of the LRD will be weaker, and the arriving traffic will
look more like a Poisson process.

In 2003, observations were recorded on university access links [37]. In
the majority of the traces, the packet and byte count time series exhibit
intermediate to heavy LRD, regardless of time of day or day of week. LRD is
also found to be unaffected by traffic load and number of active connections.
Therefore, in these access links, multiplexing of an increasingly large number
of TCP flows did not reduce correlation.



2.3. Behaviour at different timescales 

Some authors have claimed that different scaling behaviour occurs at 
different time scales. This matter still seems to be an open research question. 
LRD and self similarity are both "monofractal" models in the sense that they 
assume a constant scaling behaviour over all time-scales. Strictly speaking 
asymptotic self similarity and LRD only imply this behaviour in the limit 
(at high lags or low frequencies). Multi-fractal modelling allows this scaling 
behaviour to change at each time scale considered. The topic of multi-fractal
modelling is beyond the scope of this paper. For a good introduction see
[38]. Some authors have claimed that a multi-fractal approach is necessary
to replicate the behaviour of Internet traffic. However, others have argued
that this is not the case and that a mixture of different monofractal scalings at
different timescales is all that is necessary.



It has been argued (see [39], [33]) that protocol mechanisms (such as the
TCP feedback mechanism) have the greatest impact at smaller timescales.
At these timescales they claim that the traffic is consistent with multi-fractal 
scaling, but at larger timescales (larger than the typical RTT on the network 
being investigated) the traffic looks self-similar. 

The paper [40] compares the scaling behaviour of aggregated fractional Brownian
motion processes with that of real traffic and concludes that it is not a good
match and therefore suggests multifractal behaviour may be necessary to 
provide a good fit to real traffic traces. 

In [41] the relationship between wide-area traffic correlation and link utilisation
is explored at different timescales. The authors find that at small timescales
burstiness can impact on performance at low and intermediate utilisations,
while correlations at larger timescales are more significant at intermediate
and high utilisations.

The paper [42] analyses backbone traffic traces at multiple Tier 1 links
and investigates their behaviour at small scales (less than one second). The
presence of correlation at small timescales is attributed to the characteristics
(and not the number) of the aggregated flows and is affected by the presence of
dense flows characterised by bursts of clustered packets. The authors conclude that,
at small timescales, traffic has mainly a monofractal behaviour and even that
the traffic is "almost independent"; this is in contrast to much of the other
work discussed in this section.

The paper [43] sheds more light on the possible causes of traffic
correlation at sub-RTT timescales on backbone links. This paper confirms
the findings in [42] but in addition suggests that these clusters of bursts
derive from the TCP self-clocking mechanism and queuing delays. These clusters
of bursts are produced by flows with a large bandwidth-delay product relative
to their window size.



Internet backbone traffic dated 2002-4 is also analysed in [28] with the 
conclusion that the Poisson distribution can adequately model packet arrivals 
at smaller timescales (the threshold where behaviour changes from Poisson 
to LRD varies but is around 1000ms). It confirms the existence of LRD in 
packet and byte counts at timescales larger than a second. 

In [44], the author considers the rate of TCP flow arrivals rather than
the total traffic on a link. Several traces are investigated, collected between
1993 and 2002. The analysis finds different scaling behaviours over a range
of timescales and concludes that the flow arrivals are uncorrelated at the
smallest timescales, correlated at timescales between seconds and minutes
and consistent with "LRD or self-similarity between minutes and hours" but
that non-stationary time-of-day behaviour prevails at longer time scales.

In (ill] the authors argue using analysis of several traces (taken between 
1989 and 2002) that at longer timescales LRD is an appropriate model and at 
shorter timescales they refer to the behaviour as "pseudo-scaling", a process
which gives the appearance of multifractality but "which does not have true 
multifractal scaling underlying it" - in other words that the multifractal 
scaling observed by earlier authors is unnecessary. The scale at which the 
behaviour changes differs according to the trace being examined. 

In summary, consensus seems to have formed that LRD behaviour predominates
when traffic is considered at larger timescales (at least until the
user related non-stationarity disturbs the observation). However, the shorter
timescale behaviour is a matter for much debate, with some authors suggesting
that something as simple as a Poisson model is adequate, others suggesting
that multi-fractal models are necessary and many taking positions in between
these extremes.



2.4. Possible causes of long-range dependence in network traffic

The origins of power law behaviour in network traffic have not been un- 
equivocally identified. Several possible causes for the presence of long-range 
dependence have been proposed in the literature. 

Heavy-tailed distributions. A common suggestion is that the heavy-tailed
nature of data transfers leads to LRD in the resultant traffic (see [23,
Theorem 4.3]). Simulation studies have confirmed this experimentally: [34],
for example, found it to hold over a range of link bandwidths and a range of
buffer sizes. Also, in [46] simulations show that heavy-tailed file sizes lead to
self-similarity on large timescales but also that the delay behaviour interacts
with the TCP feedback mechanism to greatly alter the structure of the traffic
at shorter time scales.

TCP protocol. It has been argued [47] that TCP congestion control
alone can cause self-similarity regardless of the application layer traffic characteristics.
This argument, however, is contested in [48], which looks at the
same data but over longer time scales and finds that it is not consistent with
power law behaviour. Also [49], by shuffling network samples, reordering
traffic, and removing the effects of TCP mechanisms while leaving the effects
of heavy-tailed traffic, is able to show that it is heavy-tailed traffic rather
than TCP feedback mechanisms which leads to long-range dependence.

It has been suggested [50], using evidence based upon Markov modelling,
that the TCP timeout mechanism can lead to "local long-range dependence",
which the authors also refer to as "pseudo self-similarity", that is to say, self-similarity
over a small number of timescales (note that this is not true self-similarity).

The proposal in [51] suggests that the TCP retransmission mechanism can
give rise to self-similarity. Also, [34] concludes that TCP can preserve long-range
dependence over time while [52] suggests that TCP can preserve correlation
over space.

Queuing/Routing effects. Another possibility is that power law traffic
arises as a result of the interaction of queues and routing on a network [53].
Simulation experiments have shown that even when "packet inter-departure times
are independent, arrival times at the destination exhibit LRD", perhaps as
a result of the routing algorithms [54], [55].

Multi-layers and timescales. There is also the possibility that long-range
dependence simply arises because of the combination of processes occurring
at different timescales: user activity, session, and transmission processes.
In fact, it can be seen that even under the assumption of Poisson
distribution for all usage, session, and transmission processes, the mere presence
of multiple layers may lead to correlated traffic [56].

Intrinsic traffic nature. It has long been known that some types of
traffic exhibit LRD at the source. For example, variable bit rate video traffic
deriving from a single flow shows LRD in a time series of traffic [57].

While no clear consensus has yet formed, many of the authors cited in 
this and the previous two sections agree that heavy-tails are the cause of the 
LRD observed in larger time scales. No consensus seems yet to have been 
reached on the behaviour of traffic at shorter time scales and this remains an 
important topic for traffic research. The lack of consensus in this is reflected 
in the number of possible causal models for the short timescale behaviour. 



2.5. Effects on queuing 

The effects of long-range correlated traffic on buffer dimensioning have
been analysed by means of appropriate queuing models developed in [58], [59],
[60], [61], [62] among others. These models apply to infinite buffers and only
provide asymptotic results. Under the assumption of an infinite length buffer
and long-range dependent input traffic, the main finding is that the distribution
of queue length has a slower than exponentially decaying tail, as opposed to
the exponential tail observed for short-range dependent traffic. This decaying function
has instead been described by other distributions such as, for example,
Weibull and polynomial [61].



In the case of finite buffer systems, it has been suggested that, in a network
with long-range dependent traffic, the packet loss ratio is several orders
of magnitude higher than with short-range dependent traffic [63]. The packet
loss ratio could only be contained by choosing very large buffers which would
have an impact on queuing delay [63]. However, in the mid-nineties other authors
cast doubt on the usefulness of power law models of Internet traffic, by
questioning the importance of capturing traffic long-range dependence in the
case of finite buffers [64], [65]. They argue that correlation becomes irrelevant
for small buffers and short timescales.

2.6. Traffic generation models

A variety of mathematical models have been suggested in the literature
to capture the LRD in Internet traffic. For a comprehensive review of these
models the reader is referred to [21]. Only a short summary is provided here.

Fractional Brownian motion (fBm) is a non-stationary stochastic process
which is a generalisation of the well-known Brownian motion, but with a
dependence term between samples. It is a self-similar process and has a
defined Hurst parameter $H$, with Brownian motion obtained for $H = 1/2$.
If $B_H(t)$ denotes the fBm then the difference process defined as $Y_k(t) =
B_H(t + k) - B_H(t)$ with $H \in (1/2, 1)$ is the fractional Gaussian noise (fGn),
which is long-range dependent. Several methods exist for generating a fGn
process, for example [66].
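One conceptually simple, though slow, way to generate fGn exactly is to factorise its covariance matrix and multiply the factor by a vector of independent Gaussian samples. The Python sketch below takes this route using a hand-written Cholesky factorisation. It is an illustration under stated assumptions and not the method of [66]: the function names are hypothetical, unit variance is assumed, and the cubic cost makes it practical only for short series.

```python
import math
import random


def fgn_autocovariance(k, hurst):
    """Autocovariance of unit-variance fGn at lag k (standard closed form)."""
    k = abs(k)
    two_h = 2.0 * hurst
    return 0.5 * ((k + 1) ** two_h - 2.0 * k ** two_h + abs(k - 1) ** two_h)


def cholesky(matrix):
    """Lower-triangular L with L L^T = matrix (symmetric positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][p] * lower[j][p] for p in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def generate_fgn(n, hurst, seed=0):
    """Draw n samples of fGn by colouring white Gaussian noise."""
    rng = random.Random(seed)
    cov = [[fgn_autocovariance(i - j, hurst) for j in range(n)] for i in range(n)]
    lower = cholesky(cov)
    white = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(lower[i][j] * white[j] for j in range(i + 1)) for i in range(n)]


def fbm_from_fgn(increments):
    """Cumulative sum of fGn increments gives a sampled fBm path."""
    path, level = [], 0.0
    for step in increments:
        level += step
        path.append(level)
    return path
```

With `hurst=0.5` the covariance matrix is the identity and the output reduces to white noise, which gives a quick sanity check on the construction.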

Although fGn is mathematically attractive, its simplicity means that it
cannot capture a diversity of mathematical properties. The queue length
distribution obtained with a fGn process decays according to the Weibull or
"stretched exponential" distribution, which is heavy-tailed only in a weak
sense.

Fractional Auto-Regressive Integrated Moving Average (FARIMA) [1,
pages 59-66] models are an expansion of the classic time-series ARIMA models
and allow modelling of long and short range dependence simultaneously
and independently.
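The simplest member of this family, FARIMA(0, d, 0), can be written as an infinite moving average of white noise with weights $\psi_0 = 1$ and $\psi_j = \psi_{j-1}(j - 1 + d)/j$, where $d = H - 1/2$. The Python sketch below approximates the process by truncating the moving average. The function names, the truncation length and the Gaussian innovations are assumptions made for the example, and truncation removes correlation beyond the chosen number of lags.

```python
import random


def fractional_ma_weights(d, count):
    """First `count` moving-average weights psi_j of (1 - B)^(-d)."""
    weights = [1.0]
    for j in range(1, count):
        weights.append(weights[-1] * (j - 1 + d) / j)
    return weights


def farima_0d0(n, d, truncation=1000, seed=0):
    """Approximate FARIMA(0, d, 0) by a truncated moving average."""
    rng = random.Random(seed)
    psi = fractional_ma_weights(d, truncation)
    # Extra innovations at the start give every output a full history.
    noise = [rng.gauss(0.0, 1.0) for _ in range(n + truncation)]
    return [sum(psi[j] * noise[t + truncation - j] for j in range(truncation))
            for t in range(n)]
```

Short-range structure can then be added by passing the output through ordinary AR and MA filters, which is what allows the long and short range behaviour to be set separately.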

Long-range dependence can also be generated by using chaotic maps as
first proposed by Erramilli and Singh [68]. However, modelling based on
chaotic maps requires considerable experimentation, as these are very sensitive
to initial conditions and estimating their many parameters is often a
complex task [56]. The queue length distribution obtained with a family of chaotic
maps has been found to decay according to the Weibull distribution.

Another model is based on the superposition of heavy-tailed on/off sources
[25]. The process obtained by multiplexing many on/off sources with heavy-tailed
distributions tends to a fGn process.

Finally, another technique for modelling traffic is by means of wavelet
analysis [70]. This allows not only capturing the Hurst parameter but also
synthesising a wide range of scaling behaviours and the replication of the
multi-fractal spectrum [71], [38].



An important criticism of these models is in their replication of queuing
behaviour. While much work has been done to show that the models replicate
certain representative traffic statistics, one of the primary motivations cited
for using LRD models of queuing is estimating delays and buffer overflow
probabilities. These models have not been shown to do this well. Indeed, while
it has been shown that some mathematical models of LRD have very different
queuing behaviour to non-LRD versions of those models, it remains to be
shown that LRD is necessary to replicate the queuing and delay performance
of real traffic.



2.7. Criticisms and commentary

Although the majority of papers appear to replicate the finding that LRD
is present in network traffic, some have questioned whether other models are
more appropriate (for example multi-fractal models [39], [33]). Multifractals
are in fact able to model varying scaling behaviour over different timescales,
as they are characterised by a time dependent scaling coefficient. In addition,
LRD appears at long timescales which are more relevant for network dimensioning
and less for queuing behaviour. Others have also questioned whether
LRD may be unimportant in practice, for example due to multiplexing gains.

Consensus seems to be forming on the origin of LRD behaviour (as discussed
in Section 2.4) although some controversies remain. As regards
its effects, papers in the area often focus on the fact that LRD may impact
on network performance by increasing delays or increasing the packet loss
expected for a given buffer size. However, this relationship is not a simple
one and the presence of LRD does not always have a negative impact [27].
If a cause were unequivocally established the question would remain, "how
might we go about eliminating LRD from the network given this cause?" If
heavy-tailed file transfers are the cause then no clear method for resolving
the problem is obvious. Equally, if TCP feedback mechanisms are a cause
it would be difficult to change this without changing the protocol itself.

In order to understand the usefulness of power laws for practical studies,
an important question to ask is whether LRD models generate traffic with the
same queuing properties as real Internet traffic. If the models from Section
2.6 are to be useful then, when correctly tuned to the parameters of a genuine
packet trace, they should have the same mean delay and buffer overflow
probabilities as the genuine traffic. Huebner et al. [72] tested a Poisson model,
a Weibull model, an autoregressive (AR(1)) model, a Pareto model, and a
Fractional Brownian Motion model for generating traffic. None of the models
tested produced a good match for queuing performance in all circumstances.
The fBm model was useful only when the buffer size considered was large.
Similarly, [73] tests the queuing performance of fBm and three other LRD
models based on Markov modulated processes as well as some non-LRD
models. The models are tuned so that their parameters (mean and Hurst 
parameter) match real network traffic and the queuing performance of each is 
tested in an infinite buffer simulation. In this case, none of the traffic models 
replicated the queuing performance of the real traffic and the LRD models 
often showed different performance from each other despite having the same 
mean and Hurst parameter. Of course, even if a model could be found which 
accurately reproduced a given queuing behaviour obtained with real traffic, 
this would not solve the entire problem since the statistical nature of Internet 
traffic arises at least in part from TCP feedback mechanisms, which in turn 
depends upon potentially changing traffic levels and congestion. 
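
The kind of comparison described above can be sketched in a few lines. The
following is a minimal illustration and not the method of [72] or [73]: it
feeds two synthetic arrival traces with the same nominal mean through the same
discrete-time single-server queue (the Lindley recursion) and reports how
often the backlog exceeds a threshold. The function names, the Pareto shape
of 1.5, the service rate and the threshold are all assumptions chosen for
illustration, and an independent heavy-tailed trace is bursty but is not
itself an LRD model.

```python
import random


def queue_occupancy(arrivals, capacity):
    """Backlog after each slot: Q[t+1] = max(0, Q[t] + A[t] - capacity)."""
    backlog, history = 0.0, []
    for work in arrivals:
        backlog = max(0.0, backlog + work - capacity)
        history.append(backlog)
    return history


def overflow_fraction(history, threshold):
    """Fraction of slots in which the backlog exceeds the threshold."""
    return sum(1 for q in history if q > threshold) / len(history)


def light_tailed_trace(n, mean, rng):
    """Exponentially distributed work per slot (light tail)."""
    return [rng.expovariate(1.0 / mean) for _ in range(n)]


def heavy_tailed_trace(n, mean, shape, rng):
    """Pareto distributed work per slot, rescaled to the requested mean.

    random.paretovariate(shape) has mean shape / (shape - 1) for shape > 1.
    """
    scale = mean * (shape - 1.0) / shape
    return [scale * rng.paretovariate(shape) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical settings: unit mean load, 25% spare capacity.
    n, mean, capacity, threshold = 100_000, 1.0, 1.25, 20.0
    for name, trace in [
        ("exponential", light_tailed_trace(n, mean, rng)),
        ("Pareto(1.5)", heavy_tailed_trace(n, mean, 1.5, rng)),
    ]:
        history = queue_occupancy(trace, capacity)
        print(name, "P(backlog > threshold) ~", overflow_fraction(history, threshold))
```

A real study would replace the synthetic traces with a measured packet trace
and a model tuned to it, and would compare the whole backlog distribution
rather than a single threshold.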

Theoretically, some interesting queuing theory results for systems with 
LRD input traffic have been achieved but these results are often asymptotic 
results for infinite buffer models. How applicable these would be in practical 
situations remains an open question although, of course, it may be hoped 
that future theoretical results will build on them. 

Several questions therefore remain about LRD. Which LRD model, if any, 
is appropriate to generate traffic which has similar delay and buffer overflow 
probabilities to real Internet traffic when queued? Can future networks be 
designed to mitigate the potentially deleterious effects on performance which 
are said to result from LRD? Can analytical models be developed which 
give strong enough results to be practically applicable to real traffic on real 
networks? 



3. Modelling Internet topology 

Topology is the connectivity graph of a network, upon which the network's 
physical and engineering properties are based. The Internet contains millions 
of routers, which are grouped into tens of thousands of sub-networks, called 
Autonomous Systems (AS). The Internet topology can be studied at the 
router level and the AS level. Studies of the Internet topology very much 
depend on the availability and quality of measurement data. In the last 
decade a number of projects have provided more and more complete and 
accurate data on the Internet AS connectivity. By comparison it is more 
difficult to obtain router level data. So far there are more studies on the 
AS-level Internet topology than on the router-level. 
In this paper the Internet topology is considered only at the AS level,
in which a node is an AS network owned by an entity with a large Internet
presence, such as an ISP or a large company; and a link represents a
peering relationship between two AS nodes in the Border Gateway Protocol
(BGP) [74]. Research on the structure and evolution of the Internet AS
graph is relevant because the delivery of data traffic through the global
Internet depends on the complex interactions between AS that exchange routing
information using BGP.

3.1. Measuring Internet topology 

Measurements of the Internet AS graph have been available since the late 
1990s. There have been two types of measurements using different method- 
ologies and data sources. 

Passive measurements are constructed from BGP routing tables which
contain information about links from an AS to its immediate neighbours.
The Routing Information Service of RIPE [75] is another important source
of BGP data. The widely used BGP AS graphs are produced by the National
Laboratory for Applied Network Research [76] and the Route Views Project
at the University of Oregon [77]. They are connected to a number of
operational routers on the Internet for the purpose of collecting BGP tables.
The Topology Project at the University of Michigan [78] has provided an
extended version [79] of the BGP AS graph by using additional data sources,
such as the Internet Routing Registry (IRR) data and the Looking Glass (LG)
data. BGP-based AS measurements may contain links that do not actually exist
in the Internet, but a more serious problem is that the BGP measurements
may miss a significant number of links [80].

Active measurements are based on the traceroute tool which sends probe
packets to a given destination and captures the sequence of IP hops along the
forward path from the source to the destination. The Internet research
organisation CAIDA [81] has developed a tool called skitter which probes
around one million IPv4 addresses from 25 monitors around the world. Using
the core BGP tables provided by RouteViews, CAIDA maps the IP addresses in
the gathered traceroute data to AS numbers [82] and constructs AS graphs on a
daily basis. DIMES [83] is a more recent large-scale distributed measurement
effort. It collects traceroute data by probing from more than 10,000 software
clients, installed by volunteers in over 90 countries, to destinations assigned
by a central server at random from a set of five million destination addresses.
To further improve the completeness, DIMES merges the resulting AS graph
with that of RouteViews. By using more monitors and a larger list of distinct
addresses, DIMES produces larger AS graphs than skitter. The shortcoming
of the traceroute measurements is that the translation from IP addresses to
AS numbers is not trivial and could introduce many errors [84] and also,
increasingly, firewalls block the probe packets. A recent study [85] suggested
that traceroute measurements should probe destinations more frequently and
avoid using a fixed list of destination addresses.

3.2. Power law degree distribution 

In graph theory, degree $k$ is defined as the number of links, or immediate
neighbours, of a node. Degree is the principal parameter for characterising
network connectivity. The first step in describing and discriminating between
different networks is to measure the degree distribution $P(k)$, which is the
probability of finding a node with degree $k$. In 1999 it was discovered that
the Internet topology at the AS level (and the router level) exhibits a power
law degree distribution $P(k) \sim C k^{-\gamma}$ [5], where $C > 0$ is a
constant and the exponent $\gamma \approx 2.2 \pm 0.1$. This means that on the
Internet AS graph, a few nodes have very large numbers of links, whereas the
vast majority of nodes have only a few links. Although different Internet AS
graphs produced from different data sources vary in the numbers of nodes and
links, all the Internet AS graphs are well characterised by a power law degree
distribution [86]. The power law distribution is evidence that the Internet
AS level topology has evolved into a complex, heterogeneous structure that is
profoundly different from Internet models based on random graph theory. This
discovery changed the understanding of Internet topology. Since then there
has been an international effort in characterising and modelling the Internet
topology.
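
One way to examine such a claim on a measured degree sequence is to estimate
the exponent directly. The sketch below is illustrative only and is not the
procedure used in [5] or [86]: it applies the standard continuous
maximum-likelihood (Hill-type) estimate, 1 + n / sum(ln(k_i / k_min)), to the
degrees at or above a cut-off. The function name, the cut-off `k_min` and the
synthetic test data are assumptions; for small integer degrees the continuous
approximation is known to be biased, so the result is only a rough guide.

```python
import math
import random


def estimate_power_law_exponent(degrees, k_min=1.0):
    """Continuous MLE for gamma in P(k) ~ k^(-gamma), using degrees >= k_min."""
    tail = [k for k in degrees if k >= k_min]
    log_sum = sum(math.log(k / k_min) for k in tail)
    if not tail or log_sum == 0.0:
        raise ValueError("need at least one degree strictly above k_min")
    return 1.0 + len(tail) / log_sum


if __name__ == "__main__":
    # Synthetic check: draw continuous values from a density ~ k^(-2.2), k >= 1.
    # random.paretovariate(a) has density ~ x^(-(a + 1)), so a = gamma - 1.
    rng = random.Random(42)
    true_gamma = 2.2
    sample = [rng.paretovariate(true_gamma - 1.0) for _ in range(50_000)]
    print("estimated exponent:", round(estimate_power_law_exponent(sample), 3))
```

An estimate of this kind says nothing about whether a power law is the right
model in the first place; that requires a goodness-of-fit comparison against
alternative heavy-tailed distributions.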

3.3. Power law or sampling bias? 

A major problem of current measurements of the Internet AS graph
is that these measurements, whether based on BGP, traceroute or other
sources, miss a significant number of links [79], [87], [80]. Some researchers
suggested [79], [80] that there could be as many as 35% of the links in the
AS level Internet that were still to be discovered. A series of papers [88],
[89] reported that the traceroute type of measurement data collected from a
small number of observers are not only incomplete but are possibly biased in
such a way that graphs which in fact have Poisson degree distributions appear
to exhibit a power law. There has been a debate on whether the power law
degree distribution is an integral property of the Internet AS graph or merely
an artifact due to biased sampling methods.
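
The sampling effect reported in [88], [89] can be illustrated with a toy
experiment. The sketch below is an illustration under strong assumptions
rather than a reproduction of those studies: it builds an Erdos-Renyi random
graph (whose degree distribution is approximately Poisson), observes it only
through the shortest-path tree seen from a single vantage point (a crude
stand-in for traceroute probing), and compares the true and observed degree
statistics. The graph size, the mean degree and all function names are
hypothetical choices.

```python
import random
from collections import deque


def random_graph(n, mean_degree, rng):
    """Erdos-Renyi style graph with n * mean_degree / 2 distinct random links."""
    target = n * mean_degree // 2
    edges = set()
    while len(edges) < target:
        a, b = rng.randrange(n), rng.randrange(n)
        if a != b:
            edges.add((min(a, b), max(a, b)))
    neighbours = {v: set() for v in range(n)}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return neighbours


def observed_tree(neighbours, source):
    """Links seen by one vantage point: the breadth-first shortest-path tree."""
    seen = {source}
    tree = {v: set() for v in neighbours}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in neighbours[v]:
            if w not in seen:
                seen.add(w)
                tree[v].add(w)
                tree[w].add(v)
                queue.append(w)
    return tree


def degree_stats(graph):
    """Return (mean degree, maximum degree) of an adjacency-set graph."""
    degrees = [len(adj) for adj in graph.values()]
    return sum(degrees) / len(degrees), max(degrees)


if __name__ == "__main__":
    rng = random.Random(7)
    true_graph = random_graph(5_000, 20, rng)
    sampled = observed_tree(true_graph, source=0)
    print("true     (mean, max) degree:", degree_stats(true_graph))
    print("observed (mean, max) degree:", degree_stats(sampled))
```

Because a tree on n nodes has at most n - 1 links, the single vantage point
sees only a small fraction of the links, and the observed degrees no longer
resemble the true ones. Whether the observed distribution looks like a power
law depends on the graph and on the number and placement of vantage points.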

There are two sides of the argument. Many researchers believe that the
power law is an integral property of the Internet. Firstly, all Internet AS
graph measurements exhibit a power law degree distribution, including the
DIMES data which are collected from numerous observers distributed in
thousands of AS networks around the world, as well as the BGP AS graph
based on routing table data collected from many monitors and accumulated
over many years. Secondly, a recent study [90] shows that if the larger real
graph had a Poisson degree distribution and the observed power law were
due to sampling bias, then the real graph's average degree would be very
large. In the case of the AS network the true average degree would have to
be around one hundred. The observed average degree in the known sources is
between five and seven, so if this model were true it would require the
unlikely proposition that fewer than one in ten edges have been observed.
Such low coverage seems implausible.

On the other hand, there are also many researchers who are sceptical
about the power law degree distribution. Firstly, the visibility of the AS
graph can be influenced to a great extent by which vantage points are used,
not by how many. Secondly, the analysis in [90] rejects the claim that the real
AS graph may follow a Poisson degree distribution, but the real question is
whether the Internet AS graph is characterised by a power law distribution
or a different heavy-tail distribution which does not follow a power law.

Only better measurement data can settle this issue. The current situation
is that all measurements are incomplete and biased in one way or another.
There is an urgent need for improved methods to produce more complete
and accurate data. A recent effort in this direction is [85], which
investigates both the completeness and the liveness problems in the
measurement of Internet AS graph evolution.

3.4. Structures beyond the power law

Degree distribution is a first-order topological property which is based on
the connectivity information of individual nodes. When studying the Internet
structure, it is important to look beyond the power law degree distribution
because networks with exactly the same power law degree distribution can
have completely different high order properties [91], [92], [93], [94].

High order properties are calculated on the connectivity information of a
pair, a triad or a set of nodes. High order properties are able to explicitly
determine lower order properties whereas the latter only constrain the
former. Researchers have introduced many high order topological properties,
each of which has a distinct physical meaning, for example the degree-degree
correlation [95], [96], [97], [98] which indicates whether high-degree nodes
tend to connect with high-degree nodes (so-called 'assortative mixing') or
low-degree nodes ('disassortative mixing'); the rich-club coefficient [99],
[94] which quantifies how tightly the best connected nodes connect with each
other; the clustering coefficient [100] which measures the fraction of a
node's neighbours which are neighbours to each other; the average shortest
path which is the average hop distance between any two nodes; the k-core
decomposition [101] which reveals a network's underlying hierarchical
structure; and the betweenness which measures how often a node or a link is
on the shortest (fewest hop) path between two nodes.
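
Two of these properties are simple enough to compute by hand on a toy graph.
The sketch below is a minimal illustration, assuming an undirected graph
stored as adjacency sets. It uses one common degree-based form of the
rich-club coefficient (the link density among nodes of degree greater than
k), which may differ in detail from the definitions in [99] and [94]. The
function names and the convention of returning 0 for nodes of degree below
two are assumptions.

```python
from itertools import combinations


def build_graph(edges):
    """Undirected adjacency sets from a list of (a, b) links."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def clustering_coefficient(graph, node):
    """Fraction of pairs of a node's neighbours that are themselves linked."""
    neighbours = graph[node]
    if len(neighbours) < 2:
        return 0.0  # convention assumed here for nodes of degree 0 or 1
    pairs = list(combinations(neighbours, 2))
    linked = sum(1 for a, b in pairs if b in graph[a])
    return linked / len(pairs)


def rich_club_coefficient(graph, k):
    """Link density among the nodes whose degree is greater than k."""
    club = {v for v, adj in graph.items() if len(adj) > k}
    if len(club) < 2:
        return 0.0
    links = sum(len(graph[v] & club) for v in club) // 2
    return 2.0 * links / (len(club) * (len(club) - 1))


if __name__ == "__main__":
    # Toy graph: a triangle 0-1-2 with a one-degree node 3 hanging off node 0.
    toy = build_graph([(0, 1), (1, 2), (0, 2), (0, 3)])
    print(clustering_coefficient(toy, 0))  # one linked pair out of three
    print(rich_club_coefficient(toy, 1))   # nodes 0, 1 and 2 are fully meshed
```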

The Internet topology can be described as a jellyfish [102], where a highly
connected core is in the middle of the cap, and one-degree nodes form its
legs. This intuitive model is simple yet very useful as it concisely illustrates
a number of important properties of the Internet, including the dense core
(rich-club) and the large number of low degree nodes (power law) which are
directly connected with members of the core (disassortative mixing). The
Internet has a small average distance between any two nodes because the
rich-club functions as a 'super' traffic hub which provides a large selection of
shortcuts for routing and the disassortative mixing ensures that the majority
of network nodes, which are peripheral low-degree nodes, are always near the
rich-club.

Our knowledge and understanding of the Internet topology have
improved significantly in recent years. However, it is still profoundly difficult
to define the Internet topology and there are many unanswered questions:
What are the key properties that fundamentally characterise the Internet
topology? How do these properties relate to each other? What role does
each property play in the network's function and performance?

It is suggested [103], [104] that for the Internet, the second order properties
are sufficient for most practical purposes; while the third order properties
essentially reconstruct the Internet AS and router level topologies exactly.
A recent work [94] pointed out that for the Internet the degree distribution
and the rich-club coefficient restrict the degree-degree correlation to such a
narrow range, that a reasonable model for the Internet can be produced by
considering only the degree distribution and the rich-club coefficient. Note
that although these studies provide new clues on how to choose topological
properties for consideration in modelling the Internet topology, they do not
constitute a 'canonical' set of metrics that are most relevant for the network's
function and performance.

3.5. Modelling Internet topology 

Since the discovery of the power law degree distribution, a number of
models have been proposed to generate Internet-like graphs [106], [107].
Models from the networking community, such as Tier, BRITE [108],
GT-ITM (Transit-Stub) and Inet [109], often suffer from problems of no (or
an incorrect) power law, inaccurate large-scale hierarchy, requiring parameter
estimation or not providing a mechanism for network evolution; and models
from physicists [110], [111], [112], [113], [13], [114] also have problems as
they often are too general and do not incorporate any real network specifics.

In general there are two main approaches for generating topologies of
complex networks [115]. The equilibrium (top-down) approach is to construct
an ensemble of static random graphs reproducing certain properties of
observed networks and then to derive their other properties by the standard
methods. The non-equilibrium (bottom-up) approach tries to mimic the actual
dynamics of network growth: if these dynamics are accurately captured,
then the modelling algorithm, when left to run to produce a network of the
required size, will output a topology coinciding with the observations. It
is clear that the more ambitious non-equilibrium approach has the potential
to hold the ultimate truth. Classic examples of this approach include the
Barabási-Albert (BA) model [110] and the HOT model. Many models
owe their origins to the preferential attachment approach where new links
attach to nodes with a probability proportional to the degree of that node.
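
The preferential attachment idea can be sketched as follows. This is a
minimal illustration of degree-proportional attachment in the spirit of the
BA model, not a faithful implementation of any of the cited generators. The
fully connected seed of m + 1 nodes, the rule that a new node picks m distinct
targets, and the function names are assumptions of the sketch.

```python
import random


def preferential_attachment_graph(n, m, rng):
    """Grow a graph to n nodes; each new node links to m distinct old nodes
    chosen with probability proportional to their current degree."""
    if m < 1 or n <= m:
        raise ValueError("need n > m >= 1")
    # Seed: a fully connected core of m + 1 nodes (an assumption of this sketch).
    edges = [(a, b) for a in range(m + 1) for b in range(a)]
    # Each node appears in `endpoints` once per link it has, so a uniform draw
    # from this list is a draw proportional to degree.
    endpoints = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for old in targets:
            edges.append((new, old))
            endpoints.extend((new, old))
    return edges


def degree_sequence(edges):
    """Map each node to its degree."""
    degrees = {}
    for a, b in edges:
        degrees[a] = degrees.get(a, 0) + 1
        degrees[b] = degrees.get(b, 0) + 1
    return degrees


if __name__ == "__main__":
    links = preferential_attachment_graph(10_000, 2, random.Random(3))
    degrees = degree_sequence(links)
    print("nodes:", len(degrees), "links:", len(links))
    print("mean degree:", 2 * len(links) / len(degrees))
    print("largest degrees:", sorted(degrees.values(), reverse=True)[:5])
```

Keeping a list with one entry per link endpoint makes the degree-proportional
draw a single uniform choice, at the cost of memory proportional to the number
of links.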

The Positive-Feedback Preference (PFP) model proposed in 2004 [116] is
an example of the non-equilibrium models for the Internet. The model is an
extensive modification of the BA model. It is able to reproduce a large
number of characteristics (including all topological properties mentioned
above) of the Internet AS topology [117], [118], [119]. It uses two growth
mechanisms inspired by observations on the Internet history data [95], [120],
[121]. Firstly, the model starts from a small random graph and grows by two
coupled actions called interactive growth, i.e. the attachment of new nodes to
old nodes in the existing system and the addition of new links from these old
nodes to other old nodes. This resembles the dynamics whereby an Internet
service provider (ISP) that acquires new customers reacts by increasing its
number of connections to peering ISPs. Secondly, the preference probability
that node
$i$ acquires a new link (from a new node or a peer) is given as a function of
the node's degree $k_i$,

$$\Pi(i) = \frac{k_i^{\,1 + \delta \log_{10} k_i}}{\sum_j k_j^{\,1 + \delta \log_{10} k_j}}, \qquad \delta = 0.048. \tag{3}$$



This is called the positive-feedback preference, which means a node's ability
to compete for a new link increases more and more rapidly with its growing
number of links, like a positive-feedback loop. The consequence is that 'the
rich not just get richer, they get disproportionately richer'. This mechanism
resembles the 'winner-takes-all' trend in the Internet development. More
recently Chang et al. [15] proposed another bottom-up approach for generating
the Internet AS graph, where the Internet evolutionary process is modelled by
identifying a set of criteria that an AS considers either in establishing a new
peering relationship or in reassessing an existing relationship.
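
The effect of the exponent in equation (3) is easiest to see numerically. The
sketch below evaluates the positive-feedback preference for a small,
hypothetical degree sequence and compares it with plain degree-proportional
preference (the case where delta is 0). Only the formula and the value of
delta come from the text; the function names and the example degrees are
assumptions.

```python
import math

DELTA = 0.048  # value quoted in equation (3)


def pfp_weights(degrees, delta=DELTA):
    """Unnormalised weights k_i ** (1 + delta * log10(k_i)); degrees must be >= 1."""
    return [k ** (1.0 + delta * math.log10(k)) for k in degrees]


def pfp_probabilities(degrees, delta=DELTA):
    """Normalised positive-feedback preference for each node."""
    weights = pfp_weights(degrees, delta)
    total = sum(weights)
    return [w / total for w in weights]


def linear_probabilities(degrees):
    """Plain preferential attachment, for comparison (the delta = 0 case)."""
    total = sum(degrees)
    return [k / total for k in degrees]


if __name__ == "__main__":
    degrees = [1, 2, 10, 100, 1000]  # hypothetical degree sequence
    for k, p_pfp, p_lin in zip(
        degrees, pfp_probabilities(degrees), linear_probabilities(degrees)
    ):
        print(f"k={k:5d}  PFP={p_pfp:.4f}  linear={p_lin:.4f}")
```

The highest-degree node receives a larger share under the positive-feedback
preference than under linear preference, and the low-degree nodes a smaller
one, which is the 'disproportionately richer' behaviour described above.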

3.6. Practical responses to Internet power law modelling 



It is suggested [122] that the Internet power law structure is relevant to a
number of issues, such as the severely biased distribution of traffic flow, the
slow convergence of BGP routing tables [123] and the large-scale cascading
failure caused by incidents or deliberate attacks [124]. As such, the power
law property also provides novel insights into the solutions of these problems.
For example, it is shown that the power law property makes it possible to
mitigate distributed denial of service (DDoS) attacks by implementing
route-based filtering on less than 20% of AS [125]; a compact routing scheme
based on the power law property requires a significantly smaller routing table
size; and the power law property is relevant to the epidemic threshold for a
network.

Albert et al. [128] have reported that scale-free networks, i.e. networks
having power law degree distributions, are robust to random failures but
fragile to targeted attacks. This widely publicised work has generated a
wave of studies on the robustness of various networks. This work, however,
has generated some confusion in the Internet community. It should be noted
that the Internet is very different from the generic BA model used in that
study. Firstly, the Internet AS topology does not follow a strict power law
as in the BA model and the Internet's high order topological properties are
also significantly different from the BA model. Secondly, it is unrealistic, if
possible at all, to 'attack' an AS node, i.e. to wipe out an entire AS network
and cut off all its connections with other networks. This is because an AS
node can represent a network of thousands of routers which can spread across
a number of continents. And finally, it is important to realise that links
on the Internet AS graph can represent different commercial relationships
between AS networks, such that a 'path' of adjacent links between AS nodes
on the Internet AS graph does not necessarily imply routing 'reachability'
between the two AS. For example a customer AS does not transit traffic for
its providers.
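
The last point can be made concrete with a small sketch. It assumes the
commonly used 'valley-free' model of AS relationships (customer-to-provider,
peer-to-peer and provider-to-customer links), which is not spelled out in this
paper; the relationship labels, AS names and function names are hypothetical.
Under this assumed model a path is usable only if it climbs zero or more
customer-to-provider links, crosses at most one peer link, and then descends
provider-to-customer links.

```python
# Relationship of each directed hop a -> b, from a's point of view:
#   "c2p": a is a customer of b     (going up)
#   "p2p": a and b are peers        (going sideways)
#   "p2c": a is a provider of b     (going down)

def is_valley_free(path, relationship):
    """Return True if the AS path respects the assumed valley-free rule."""
    descending = False  # becomes True after the first peer or downhill hop
    for a, b in zip(path, path[1:]):
        kind = relationship.get((a, b))
        if kind is None:
            return False  # no link between a and b in the graph
        if kind == "c2p":
            if descending:
                return False  # a would be giving transit to its own providers or peers
        elif kind == "p2p":
            if descending:
                return False  # at most one peer hop, and only at the top
            descending = True
        elif kind == "p2c":
            descending = True
        else:
            raise ValueError(f"unknown relationship {kind!r}")
    return True


def add_link(relationship, a, b, kind):
    """Record a link in both directions with the mirrored label."""
    mirror = {"c2p": "p2c", "p2c": "c2p", "p2p": "p2p"}
    relationship[(a, b)] = kind
    relationship[(b, a)] = mirror[kind]


if __name__ == "__main__":
    rel = {}
    add_link(rel, "Customer", "ProviderA", "c2p")
    add_link(rel, "Customer", "ProviderB", "c2p")
    # A graph path exists, but the customer would carry its providers' traffic.
    print(is_valley_free(["ProviderA", "Customer", "ProviderB"], rel))  # False
    print(is_valley_free(["Customer", "ProviderA"], rel))               # True
```

Shortest-path metrics computed on the plain AS graph ignore this constraint,
so they can count paths that policy routing would never use.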

3.7. Criticisms and commentary

The discovery of a power law degree distribution in the Internet topology 
has attracted a huge amount of attention and there have been tremendous 
efforts to measure, characterise and model the Internet topology. Recent
debate suggests that whether the power law degree distribution is an integral
property of the Internet is still an open question. It is vital for researchers
to look beyond the power law property and appreciate high order properties 
of the Internet topology. 

There are generative models which reproduce the Internet topology well
as a pure graph. However the reachability between two AS nodes on the
Internet is not only affected by the underlying connectivity graph, but also 
constrained by many other factors, such as routing policies, capacities, demo- 
geographic distributions and local structures. Future Internet models should 
more closely reflect the Internet specifics in order to produce practically useful 
results. 



As pointed out in [129], there is a need for more interdisciplinary
communication among computer scientists, mathematicians, physicists and
engineers. Such communication is much needed to facilitate the
interdisciplinary flow of knowledge and enable the network research community
to convert theoretical results into more practical solutions that matter for
real networks, e.g. performance, revenue and engineering.

4. Conclusions 

An obvious question arising from this paper is whether there is a connection
between the power law topology and the power laws observed in traffic
levels. One likely mechanism for such an interaction would come from
considering how traffic aggregates as a result of the topology. A starting
point might be the work reported in [130] which combines power law topologies
with simulations involving LRD sources. Further research in this area might
well be fruitful.

Most authors agree that power law relationships are present in measure- 
ments of network traffic. Measurements of file size transfers appear consistent 
with a heavy-tailed distribution. Measurements of traffic levels per unit time 
and packet inter-arrival times fit the hypothesis of LRD. However, this only 
describes the long time-scale behaviour (at least below the time-scale where 
day-to-day non-stationarity affects measurement). The behaviour of network 
traffic at shorter time scales is still an open question with different authors 
proposing different models. On the origin of long-range dependence, con- 
sensus appears to have formed that heavy-tailed distribution of file sizes is 
the major cause but with alterations to the short term behaviour arising 
from TCP protocol interactions. However, some authors give other explana- 
tions for the short term behaviour and the matter cannot yet be said to be 
definitively settled. 

While many models have been proposed which generate traffic with the
appropriate power law behaviour, it remains to be shown which of these,
if any, best fits real traffic traces. In particular, if LRD is of relevance for
queuing and buffer behaviour, it is key that the model selected replicates the
queuing performance of the real traffic; that no model has yet been shown to
do so is an important shortcoming. The models proposed to describe queuing
behaviour with long-range dependent input traffic suggest that the tail of the
queue occupancy distribution decays slower than exponentially. If the study of
power laws is to result in a positive effect on network traffic engineering
then: 1) it is important to find a power law based traffic generation model
which replicates the queuing performance of the real traffic; 2) progress
needs to be made in ways to either mitigate the effects of LRD or to plan a
network by allowing for it.

Researchers have also made progress on measuring and modelling the
Internet topology at the AS-level. More complete and accurate measurement
data are needed to establish whether the power law degree distribution
is indeed an integral property of the Internet. Much more research work
is needed, for example, to identify the key topological properties that
fundamentally characterise the Internet structure and to include the Internet
specifics in topology models. It is encouraging that the power law modelling
of Internet topology has begun to stimulate research which takes advantage
of this network structure. There is an increasing recognition that effective
engineering of the global Internet should be based on a detailed understanding
of issues such as the large-scale structure of its underlying physical topology,
the manner in which it evolves over time, and the way in which its constituent
components contribute to its overall function [131].

In summary, for the research in power laws to truly have an engineering 
impact on the Internet, reliable and calibrated models are needed which 
match the characteristics of real data. It could be argued that there has 
been a certain level of success for topology generation but certainly not for 
traffic generation. The models should be capable of application as a design 
tool to allow engineers to improve real life network performance. As yet, this 
stage of research appears elusive in both fields. 